AITopics

doi: 10.1109/LRA.2025.3632119

2507.14731

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.53)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.40)

arXiv.org Artificial Intelligence Nov-7-2025

PLLuM: A Family of Polish Large Language Models

Kocoń, Jan, Piasecki, Maciej, Janz, Arkadiusz, Ferdinan, Teddy, Radliński, Łukasz, Koptyra, Bartłomiej, Oleksy, Marcin, Woźniak, Stanisław, Walkowiak, Paweł, Wojtasik, Konrad, Moska, Julia, Naskręt, Tomasz, Walkowiak, Bartosz, Gniewkowski, Mateusz, Szyc, Kamil, Motyka, Dawid, Banach, Dawid, Dalasiński, Jonatan, Rudnicka, Ewa, Alberski, Bartłomiej, Walkowiak, Tomasz, Szczęsny, Aleksander, Markiewicz, Maciej, Bernaś, Tomasz, Mazur, Hubert, Żyta, Kamil, Tykierko, Mateusz, Chodak, Grzegorz, Kajdanowicz, Tomasz, Kazienko, Przemysław, Karlińska, Agnieszka, Seweryn, Karolina, Kołos, Anna, Chrabąszcz, Maciej, Lorenc, Katarzyna, Krasnodębska, Aleksandra, Wilczek, Artur, Dziewulska, Katarzyna, Betscher, Paula, Cieślińska, Zofia, Kowol, Katarzyna, Mikoś, Daria, Trzciński, Maciej, Krutul, Dawid, Kozłowski, Marek, Dadas, Sławomir, Poświata, Rafał, Perełkiewicz, Michał, Grębowiec, Małgorzata, Kazuła, Maciej, Białas, Marcin, Roszko, Roman, Roszko, Danuta, Vaičenonienė, Jurgita, Utka, Andrius, Levchuk, Paweł, Kowalski, Paweł, Prawdzic-Jankowska, Irena, Ogrodniczuk, Maciej, Borys, Monika, Bulińska, Anna, Gumienna, Wiktoria, Kieraś, Witold, Komosińska, Dorota, Krasnowska-Kieraś, Katarzyna, Kobyliński, Łukasz, Lewandowska, Martyna, Łaziński, Marek, Łątkowski, Mikołaj, Mastalerz, Dawid, Milewicz, Beata, Mykowiecka, Agnieszka Anna, Peljak-Łapińska, Angelika, Penno, Sandra, Przybysz, Zuzanna, Rudolf, Michał, Rybak, Piotr, Saputa, Karolina, Tomaszewska, Aleksandra, Wawer, Aleksander, Woliński, Marcin, Wołoszyn, Joanna, Wróblewska, Alina, Żuk, Bartosz, Żarnecki, Filip, Kaczyński, Konrad, Cichosz, Anna, Deckert, Zuzanna, Garnys, Monika, Grabarczyk, Izabela, Janowski, Wojciech, Karasińska, Sylwia, Kujawiak, Aleksandra, Misztela, Piotr, Szymańska, Maria, Walkusz, Karolina, Siek, Igor, Kwiatkowski, Jakub, Pęzik, Piotr

Large Language Models (LLMs) play a central role in modern artificial intelligence, yet their development has been primarily focused on English, resulting in limited support for other languages. We present PLLuM (Polish Large Language Model), the largest open-source family of foundation models tailored specifically for the Polish language. Developed by a consortium of major Polish research institutions, PLLuM addresses the need for high-quality, transparent, and culturally relevant language models beyond the English-centric commercial landscape. We describe the development process, including the construction of a new 140-billion-token Polish text corpus for pre-training, a 77k custom instructions dataset, and a 100k preference optimization dataset. A key component is a Responsible AI framework that incorporates strict data governance and a hybrid module for output correction and safety filtering. We detail the models' architecture, training procedures, and alignment techniques for both base and instruction-tuned variants, and demonstrate their utility in a downstream task within public administration. By releasing these models publicly, PLLuM aims to foster open research and strengthen sovereign AI technologies in Poland.

large language model, machine learning, natural language

2511.03823

Country:

North America (1.00)
Europe > Poland (1.00)
Asia (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Law Enforcement & Public Safety (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

arXiv.org Artificial Intelligence May-6-2025

Task-Oriented Semantic Communication in Large Multimodal Models-based Vehicle Networks

Du, Baoxia, Du, Hongyang, Niyato, Dusit, Li, Ruidong

Task-oriented semantic communication has emerged as a fundamental approach for enhancing performance in various communication scenarios. While recent advances in Generative Artificial Intelligence (GenAI), such as Large Language Models (LLMs), have been applied to semantic communication designs, the potential of Large Multimodal Models (LMMs) remains largely unexplored. In this paper, we investigate an LMM-based vehicle AI assistant using a Large Language and Vision Assistant (LLaVA) and propose a task-oriented semantic communication framework to facilitate efficient interaction between users and cloud servers. To reduce computational demands and shorten response time, we optimize LLaVA's image slicing to selectively focus on areas of utmost interest to users. Additionally, we assess the importance of image patches by combining objective and subjective user attention, adjusting energy usage for transmitting semantic information. This strategy optimizes resource utilization, ensuring precise transmission of critical information. We construct a Visual Question Answering (VQA) dataset for traffic scenarios to evaluate effectiveness. Experimental results show that our semantic communication framework significantly increases accuracy in answering questions under the same channel conditions, performing particularly well in environments with poor Signal-to-Noise Ratios (SNR). Accuracy can be improved by 13.4% at an SNR of 12 dB and by 33.1% at 10 dB.
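As a rough illustration of the importance-weighted transmission idea in the abstract above, the following is a minimal sketch and not the authors' method. It assumes each image patch has an objective score and a subjective (user attention) score in [0, 1], mixes them with a hypothetical weight `alpha`, and splits a fixed power budget in proportion to the result. The function name, the linear mixing rule, and the proportional split are all assumptions made for the example.

```python
def allocate_power(objective, subjective, total_power, alpha=0.5):
    """Split a transmit-power budget across image patches by importance.

    objective, subjective: per-patch scores in [0, 1] (hypothetical inputs).
    alpha: assumed mixing weight between the two kinds of attention.
    """
    if len(objective) != len(subjective):
        raise ValueError("score lists must have the same length")
    # Combine the two attention scores into one importance value per patch.
    combined = [alpha * o + (1 - alpha) * s for o, s in zip(objective, subjective)]
    if not combined:
        return []
    total = sum(combined)
    if total == 0:
        # No patch stands out: fall back to a uniform split.
        return [total_power / len(combined)] * len(combined)
    # Each patch receives power proportional to its combined importance.
    return [total_power * c / total for c in combined]


if __name__ == "__main__":
    shares = allocate_power([0.2, 0.8, 0.5], [0.9, 0.1, 0.5], total_power=3.0)
    print(shares)
```

Under this sketch, patches that score higher on either kind of attention receive a larger share of the budget, while the total transmitted power stays fixed.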

information, large language model, machine learning

doi: 10.1109/TMC.2025.3564543

2505.02413

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)

Genre:

Research Report > New Finding (0.66)
Personal > Honors (0.46)

Industry:

Information Technology > Security & Privacy (0.92)
Transportation (0.69)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

arXiv.org Artificial Intelligence Mar-9-2025

PythonPal: Enhancing Online Programming Education through Chatbot-Driven Personalized Feedback

Palahan, Sirinda

The rise of online programming education has necessitated more effective, personalized interactions, a gap that PythonPal aims to fill through its innovative learning system integrated with a chatbot. This research delves into PythonPal's potential to enhance the online learning experience, especially in contexts with high student-to-teacher ratios where there is a need for personalized feedback. PythonPal's design, featuring modules for conversation, tutorials, and exercises, was evaluated through student interactions and feedback. Key findings reveal PythonPal's proficiency in syntax error recognition and user query comprehension, with its intent classification model showing high accuracy. The system's performance in error feedback, though varied, demonstrates both strengths and areas for enhancement. Student feedback indicated satisfactory query understanding and feedback accuracy but also pointed out the need for faster responses and improved interaction quality. PythonPal's deployment promises to significantly enhance online programming education by providing immediate, personalized feedback and interactive learning experiences, fostering a deeper understanding of programming concepts among students. These benefits mark a step forward in addressing the challenges of distance learning, making programming education more accessible and effective.
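The abstract above does not say how PythonPal recognizes syntax errors. As a hedged illustration of one common approach, the sketch below uses Python's standard `ast.parse` to check a student's snippet and turn any `SyntaxError` into a short feedback message. The function name `syntax_feedback` and the message wording are hypothetical.

```python
import ast


def syntax_feedback(source):
    """Return a short feedback message for a student's Python snippet."""
    try:
        # ast.parse compiles the text to a syntax tree without running it.
        ast.parse(source)
    except SyntaxError as err:
        # lineno and msg are standard attributes of SyntaxError.
        return f"Syntax error on line {err.lineno}: {err.msg}"
    return "No syntax errors found."


if __name__ == "__main__":
    print(syntax_feedback("x = 1\n"))
    print(syntax_feedback("x = = 1\n"))
```

Because the snippet is only parsed and never executed, this kind of check is safe to run on untrusted student input. It catches syntax errors only, not runtime or logic errors.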

chatbot, pythonpal, student

doi: 10.1109/TLT.2025.3545084

2503.16487

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
South America > Brazil (0.04)
Oceania > New Zealand (0.04)

Genre: Instructional Material > Online (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting > Online (1.00)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

The Guardian Feb-19-2025, 12:13:58 GMT

EU accused of leaving 'devastating' copyright loophole in AI Act

"What I do not understand is that we are supporting big tech instead of protecting European creative ideas and content." The EU's AI Act, which came into force last year, was already in the works when ChatGPT, an AI chatbot that can generate essays, jokes and job applications, burst into public consciousness in late 2022, becoming the fastest-growing consumer application in history. ChatGPT was developed by OpenAI, which is also behind the AI image generator Dall-E. He would like legislation to fill that gap, but said it would take years, after the European Commission's decision last week to withdraw the proposed AI Liability Act. "It might be getting very difficult.

ai act, machine learning, natural language


Country: Europe (0.94)

Industry:

Law (1.00)
Government > Regional Government > Europe Government (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.75)

#artificialintelligence Feb-3-2023, 15:46:12 GMT

The UK rolls back controversial plans to open up text and data mining regulations • TechCrunch

The U.K. Government is seemingly backtracking on plans that would have allowed text and data mining "for any purpose," plans designed to position the U.K. as a "global AI superpower." The news emerges following months of blowback from creative industries concerned about what impact the rules might have on protected works. Text and data mining, for the uninitiated, is an essential component of just about every AI application, allowing researchers and developers to leverage disparate datasets to train their algorithms. But gaining access to a sufficient amount of data is not a straightforward endeavor, given that data is often owned by organizations or individuals that might not want third parties to have access to their data. Or, they may only make said data available under a commercial license, making it prohibitively expensive to harness.

controversial plan, data mining, metals & mining

Country: Europe > United Kingdom (0.16)

Industry:

Government (1.00)
Materials > Metals & Mining (0.85)
Law > Statutes (0.85)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Data Science > Data Mining (0.85)

#artificialintelligence Dec-2-2022, 15:30:23 GMT

We are tearing up creative rights to feed a flawed Whitehall obsession with AI

There's no reason you should have ever heard of Simon Squibb, the "chief purpose officer of the Purposeful Project". Mr Squibb, who describes himself as an "Elon Musk wanna-be" in his Twitter profile, is one of those tirelessly energetic mid-life influencers who proliferate on the petri dish of LinkedIn. Displaying the sort of enthusiasm that Matt Hancock reserves for a Bushtucker Challenge, Mr Squibb is on a mission. "I want to fix the education system", he says. This fix entails removing something that many of us consider quite an important part of the education system: the learning part.

government, squibb, whitehall obsession

Country: Europe > United Kingdom (0.31)

Industry:

Law (0.74)
Information Technology > Services (0.36)
Government > Regional Government > Europe Government (0.31)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Jul-27-2022, 17:40:09 GMT

All Change (but Not Just Yet) When It Comes to AI and IP

Artificial Intelligence (AI) has the potential to transform many aspects of life, and the UK government has recognized that it is important to review IP laws to ensure that they evolve and promote innovation in this fast-paced area of technology. That was the motivation behind a recent UKIPO consultation, which reported earlier this week. With regard to patent protection for AI-devised inventions, the report concluded that no changes are required to UK patent law, at least for the time being. At present, despite the claims of certain parties and the international court case relating to the DABUS system, which its promoters sought to name as the inventor on patent applications in a number of countries, there is no evidence of AI currently having the capacity to invent. Rather, the general consensus from respondents was that AI technology cannot, at least at present, invent without human assistance.

ai and ip, protection, text and data mining

Country: Europe > United Kingdom (0.39)

Genre: Research Report (0.36)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Government > Regional Government > Europe Government > United Kingdom Government (0.39)

Technology: Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Jun-29-2022, 12:04:16 GMT

UK to boost AI development by removing data mining hurdles – TechCrunch

The U.K. is planning to tweak an existing law to allow text and data mining "for any purpose," in a move that's designed to boost artificial intelligence (AI) development across the country. The announcement constitutes part of a broader strategy to "level up" AI and transform the U.K. into what it calls a "global AI superpower" -- and part of this will involve reassessing existing intellectual property (IP) laws. Following a two-month consultation period where stakeholders from across the industrial spectrum were asked for input, including rightsholders, academics, lawyers, trade organisations and businesses, the U.K.'s Intellectual Property Office (IPO) today published its response and confirmed what will (and won't) be changing moving forward. Text and data mining (TDM) is pivotal to the development of new AI applications, allowing researchers and businesses to copy and harness disparate datasets to train their algorithms. However, gaining access to enough relevant data has inherent challenges -- the data is often owned by third parties that may only want to make data available under a commercial license, if they make it available at all.

ai development, data mining, rightsholder

Country: Europe > United Kingdom (0.05)

Industry: Law > Intellectual Property & Technology Law (0.92)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Data Science > Data Mining (0.85)

Communications of the ACM Oct-26-2021, 05:10:37 GMT

Text and Data Mining of In-Copyright Works

Text and data mining (TDM) uses statistical analysis tools to extract new knowledge from large quantities of text or data by finding patterns, discovering relationships, and analyzing semantics. It is used in a wide variety of fields from biomedical research to digital humanities. Japan has enacted laws to allow TDM research copying. These lawsuits grew out of the Google Book Search Project (GBS). GBS is a corpus of millions of digital books that Google developed to improve its search technologies, after making a deal with the University of Michigan in 2004 to scan all eight million books in its library's collections. In return, Michigan got back from Google digital copies of the books it scanned.

exception, google, tdm researcher


Country:

North America > United States > Michigan (0.45)
Asia > Japan (0.36)
North America > United States > California > Alameda County > Berkeley (0.15)

Industry: Law > Litigation (0.91)

Technology:

Information Technology > Artificial Intelligence (1.00)
Information Technology > Data Science > Data Mining (0.61)